[57] ABSTRACT

An information processing system providing archive/backup
support with privacy assurances by encrypting data stored 
thereby. Data generated on a source system is encrypted, the 
key used thereby is separately encrypted, and both the 
encrypted data and encrypted key are transmitted to and 
maintained by a data repository system. The repository 
system receives only the encrypted data and key, while the 
source system retains the ability to recover the key and in 
turn, the data. The source system is therefore assured of 
privacy and integrity of the archived data by retaining access 
control yet is relieved of the physical management of the 
warehousing medium. 

24 Claims, 3 Drawing Sheets 
SECURE FILE ARCHIVE THROUGH
ENCRYPTION KEY MANAGEMENT 

CROSS REFERENCE TO RELATED 
APPLICATIONS 

A claim of priority is made to U.S. Provisional Patent 
Application No. 60/037,597, entitled FILE COMPARISON 
FOR DATA BACKUP AND FILE SYNCHRONIZATION, 
filed Feb. 11, 1997. 

STATEMENT REGARDING FEDERALLY 
SPONSORED RESEARCH OR DEVELOPMENT 

Not Applicable 

FIELD OF THE INVENTION 

The present invention relates to data archive operations 
for information processing systems, and more particularly to 
security features for such operations. 

BACKGROUND OF THE INVENTION 

In an information processing system periodic archival of 
static, unused objects is desirable to optimize access to more 
active items and to guard against failure such as disk head 
crashes and human error such as accidental deletions. 
Consequently, periodic backups to magnetic tape and cor- 
responding purging of selected files from online disks is a 
common practice. 

Data archival mechanisms need to assure the integrity of 
data stored thereby. Users of the data need to know data is 
persistent, and also that there is a reasonable turnaround time 
for retrieval. Often this entails copying such data entities, 
hereinafter files, to an inexpensive, high volume, but not 
necessarily fast access, form of physical storage such as 
magnetic tape. Corresponding index information regarding 
the magnetic tape location of a particular file can be retained 
online. Since index information referencing a file consumes 
much less storage than the file itself, such information is not 
as unwieldy as the actual data file counterpart. In order to 
retrieve a file, the index is consulted to determine the 
physical volume of the corresponding file. The physical 
magnetic tape volume is then searched for the desired entity. 
Although sequential, this aspect of the search can be per- 
formed within a reasonable time since the indexing system 
has narrowed the field to a single volume. Such indexing 
schemes are numerous and are well known to those skilled 
in the art. 

Images written to magnetic tape, however, remain fixed 
and readable unless physically overwritten. Successive revi- 
sions of backups tend to render the previous versions 
obsolete, although the earlier versions still exist on the tape. 
Such a tape might well be discarded, thereby placing it in the 
public domain, or partially used for another purpose, leaving 
an uncertain status of the information which may exist 
randomly and unprotected. Further attenuation of control 
over the data occurs when another party performs the
archive. Since the archiving operation usually bears little 
relation to the generation of the data, it is often desirable to 
delegate this operation. The archive operation may be under- 
taken by a co-located group, a group at a remote location of 
the same organization, or an external contractor, and could 
involve either electronic or physical mediums of data trans- 
mission. Delegation of the backup operation to an archive 
server, however, raises issues of security and privacy, since 
the corporation or individual generating the data (hereinafter 
source organization) has little control over access to the data 
at a remote facility. With regard to file deletion, however,
magnetic tape does not lend itself well to selective rewrite.
Due to the sequential nature of magnetic tape, intra-tape
modifications can compromise subsequent files. It is therefore
difficult for an archive service to ensure integrity of data
upon retrieval requests, provide effective deletion of obso- 
lete data, and maintain secrecy of data while under the 
control of the archive mechanism. 

BRIEF SUMMARY OF THE INVENTION

The present invention addresses the problem of privacy 
for archived data by providing the source organization with 
control over the data without burdening the reliability of 
retrieval with the problems caused by sequential overwrite. 

An encryption function applied to the archived data renders
it in a form unintelligible to unauthorized observers. Encryption
involves arithmetic manipulations of the data using a
specific value called a key, which renders the data in an
unintelligible form. This key bears a specific mathematical
relationship to the data and the encryption algorithm being
used. Returning the data to the original form involves
applying the corresponding inverse function to the encrypted
form. Without the proper key, however, it is very difficult to
determine the inverse, or decryption, function. The security
provided by encryption rests on the premise that with a
sufficiently large key, substantial computational resources
are required to determine the original data. Encrypting a file
with a particular key, and then encrypting the key itself using
a master key, therefore, allows another party to physically
maintain and store the data while the originator, or source,
of the data retains access control. Additional security and
authentication measures can also be taken, such as further
encrypting the key or the data at the server with a server key,
and the use of cipher block chaining to impose dependencies
among a sequence of file blocks.
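
The two-level arrangement described above can be summarized in a short notation. The symbols are assumptions introduced here for illustration and do not appear in the specification: F is the file, K_s the per-file key, K_m the master key, E the encryption function and D its inverse.

```latex
\begin{align*}
C &= E_{K_s}(F), & D_{K_s}(C) &= F,\\
W &= E_{K_m}(K_s), & D_{K_m}(W) &= K_s.
\end{align*}
```

Under this notation the other party stores only C and W, while the source retains K_m. For a key of n bits, an exhaustive search must try up to 2^n candidate keys; a 56-bit key, for example, gives 2^56, or about 7.2 x 10^16, candidates.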

In accordance with the present invention, an archive
server utilizes encryption techniques to maintain both security
and integrity of stored data by maintaining a series of
keys for each archived file, and encrypting both the archived
file, and the key to which it corresponds. The archive server
manages the encrypted files and the corresponding
encrypted keys, while the source organization maintains
only the master key required to recover the individual
encrypted keys. Through this arrangement, the source organization
maintains control and assurances over access to the
archived data, while the archive server manages the physical
storage medium and performs individual encrypted file
manipulation requests at the behest of the client. The archive
server maintains access only to the encrypted data files and
encrypted keys, effectively managing these files and keys as
abstract black-box entities, without the ability to examine
and interpret the contents.

Three common transactions involving archived encrypted
files are effected by the present invention. A source organization
desiring to archive files periodically transfers files
from its online repository, usually a fast access storage
medium such as a disk, to the archive server. To retrieve
archived information, a retrieval transaction indicating a
particular file occurs. Finally, when an item is to be deleted,
a deletion instruction implicating a particular file is issued to
the archive server.
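
The three transactions can be sketched from the archive server's side as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class name `ArchiveServer`, the method names, the file identifiers and the in-memory dictionaries standing in for the tape volume and the index are all hypothetical. The `delete` method zeroes the stored encrypted key, following the deletion procedure given in the Detailed Description.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class IndexEntry:
    volume: str               # label of the volume holding the encrypted file
    encrypted_key: bytearray  # opaque to the server; mutable so it can be zeroed


class ArchiveServer:
    """Hypothetical server that handles only opaque encrypted blobs."""

    def __init__(self) -> None:
        # Stand-in for the online index (file id -> volume and encrypted key).
        self._index: Dict[str, IndexEntry] = {}
        # Stand-in for long term storage ((volume, file id) -> encrypted file).
        self._tapes: Dict[Tuple[str, str], bytes] = {}

    def archive(self, file_id: str, encrypted_file: bytes,
                encrypted_key: bytes, volume: str = "VOL001") -> None:
        self._tapes[(volume, file_id)] = encrypted_file
        self._index[file_id] = IndexEntry(volume, bytearray(encrypted_key))

    def retrieve(self, file_id: str) -> Tuple[bytes, bytes]:
        entry = self._index[file_id]
        return self._tapes[(entry.volume, file_id)], bytes(entry.encrypted_key)

    def delete(self, file_id: str) -> None:
        entry = self._index.pop(file_id)
        # Overwrite the key bytes with zeros before discarding the entry.
        for i in range(len(entry.encrypted_key)):
            entry.encrypted_key[i] = 0
        # The encrypted file on the volume is deliberately left untouched.
```

Note that the server never sees a plaintext file or an unencrypted key in any of the three methods, and that after `delete` the encrypted file still exists on the volume but its key is gone.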

One benefit provided by this arrangement is the elimination
of access to data by the archive server, therefore
providing the source organization with assurances of access
control and privacy, while relieving the source organization
of archive cataloging and physical storage duties.
Furthermore, effective deletion of information stored on
archive tapes is achieved without physical modification to
magnetic tape, therefore avoiding compromise to subsequent
data on the same volume.

BRIEF DESCRIPTION OF THE SEVERAL 
VIEWS OF THE DRAWING 

The invention will be more fully understood in view of the 
following Detailed Description of the Invention and 
Drawing, of which: 

FIG. 1 is a block diagram of the physical information 
flow; 

FIG. 2 is a flowchart depicting the archival method; and 
FIG. 3 is a flowchart depicting the retrieval method. 

DETAILED DESCRIPTION OF THE 
INVENTION 

U.S. Provisional Patent Application No. 60/037,597 
entitled FILE COMPARISON FOR DATA BACKUP AND 
FILE SYNCHRONIZATION, filed Feb. 11, 1997, is incor- 
porated herein by reference. 

Referring to FIG. 1, in a computer information processing 
system large amounts of data are stored and must periodi- 
cally be archived. Often data is copied from a source system 
8 to an archive information processing system 30, herein- 
after archive server, over a transmission medium, 26 & 28. 
The archive server 30 then copies the data to be archived 
onto a suitable long term storage volume such as magnetic 
tape 36. 

An archive transaction for a file stored at the source 
system encompasses encryption of the file on the source 
system using a secondary key, encryption of the secondary 
key on the source system using a master key, and transmis- 
sion of the encrypted file and the associated encrypted key 
to the archive server. Transmission is electronic via com- 
puter network, or in alternative embodiments by physical 
delivery of a suitable magnetic medium. The archive server 
then stores the encrypted file on magnetic tape or another 
medium of long term storage, and stores the encrypted key 
along with an index to the tape containing the encrypted file. 
The master key used to encrypt the secondary key is retained 
on the source system. 
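
The archive transaction on the source system can be sketched as below. This is an illustration only: the specification leaves the encryption method open, so the cipher here is a toy stand-in (an XOR keystream derived with SHA-256) chosen to keep the example dependent on the standard library alone, and it should not be treated as secure. The names `toy_encrypt`, `toy_decrypt` and `archive_on_source` are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy keystream: SHA-256 over key, nonce and a block counter (NOT secure).
    out = bytearray()
    for i in range(0, len(data), 32):
        counter = (i // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # A fresh nonce is prepended so that one key can cover several messages.
    nonce = secrets.token_bytes(16)
    return nonce + _xor_stream(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    return _xor_stream(key, blob[:16], blob[16:])


def archive_on_source(file_bytes: bytes, master_key: bytes) -> Tuple[bytes, bytes]:
    """Return (encrypted file, encrypted key); the master key never leaves."""
    secondary_key = secrets.token_bytes(32)                  # one key per file
    encrypted_file = toy_encrypt(secondary_key, file_bytes)  # file under secondary key
    encrypted_key = toy_encrypt(master_key, secondary_key)   # secondary key under master
    return encrypted_file, encrypted_key
```

Only the two returned values would be transmitted to the archive server. The secondary key exists in the clear only inside `archive_on_source`, and the master key is an input that stays on the source system.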

Referring to FIGS. 1 and 2, a file 10 to be archived is
identified 100 within a fast access storage medium 12 of the 
source information system 8, and is sent to a cryptographic 
engine 14. The present embodiment incorporates a disk 
drive as the fast access storage medium, although an alter- 
native embodiment could use other modes of digital fixation, 
such as CD-ROM. The cryptographic engine 14 may be an 
application within the same node or an independent CPU, 
and may invoke specialized encryption hardware, depending 
on the encryption method desired. Any of various known 
encryption methods could be employed. 

A key generator 16 then generates a secondary key 18 as 
shown in step 102, and uses this key to encrypt the file 10 
as shown in step 104 to produce an encrypted file 20, at step 
106. The master encryption key 22 is then obtained in step 
108 and used to encrypt the secondary key 18, as shown
at step 110, and produce an encrypted key 24, as indicated 
in step 112. Note that since the same master key is used to 
encrypt multiple secondary keys it need be generated only 
once and then reused for successive secondary keys. The 
encrypted file 20 and encrypted key 24 are then transmitted 
to the archive server at steps 116 and 118, respectively, while 
the master key 22 is retained at the source system 8 at step 
114. Transmission may be accomplished via Internet 26,
dialup connection 28, or in alternative embodiments, other
means such as physical delivery of the storage medium.
Encryption may be performed by any of various known
methods, such as RSA, DES, and other permutations, and
may involve authentication and verification either through a
trusted third party or mathematical methods. Such authentication
and verification may involve cipher block chaining
(CBC), to perform an XOR on all or part of a previous block
and use the resultant value in encrypting a successive block,
or checksums such as cyclic redundancy checks (CRC),
MD4, and MD5, which accumulate all values in a particular
block according to a mathematical formula to arrive at a
value which is highly unlikely to be duplicated if data in the
block is changed or lost.
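
The chaining and checksum ideas above can be sketched as follows. This is a hedged illustration, not the disclosed method: the block function `_toy_block_encrypt` is a deliberately trivial stand-in for a real block cipher (an XOR with a key-derived pad, which is invertible but not secure), and the names `cbc_encrypt`, `cbc_decrypt` and `checksums` are hypothetical. The chaining logic itself follows the usual CBC construction, in which each plaintext block is XORed with the previous ciphertext block before being encrypted.

```python
import hashlib
import zlib
from typing import Tuple

BLOCK = 16  # block size in bytes (an assumption for this sketch)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for a real block cipher: XOR with a key-derived pad (NOT secure).
    return _xor(block, hashlib.sha256(key).digest()[:BLOCK])


_toy_block_decrypt = _toy_block_encrypt  # XOR with the same pad is its own inverse


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a whole number of blocks")
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # Each block depends on the ciphertext of the block before it.
        prev = _toy_block_encrypt(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(_toy_block_decrypt(key, block), prev))
        prev = block
    return b"".join(out)


def checksums(block: bytes) -> Tuple[int, str]:
    # CRC-32 and MD5 values for a block; a change to the block changes both.
    return zlib.crc32(block), hashlib.md5(block).hexdigest()
```

Because of the chaining, altering the first plaintext block changes every later ciphertext block, which is the dependency among blocks that the text describes.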

Upon receipt of the encrypted file 20 and the encrypted
key 24, the archive server 30 writes the encrypted file 32 to
a magnetic tape 36, or other medium of long term storage
which is inexpensive and which need not encompass real
time access, via tape drive 34 at step 120. The encrypted key
38 is then written to a tape index disk file 40 at step 122,
thereby associating the magnetic tape volume 36 with the
encrypted file 32 and the encrypted key 38. In alternative
embodiments, a further encryption operation may be performed
at the archive server on the encrypted file 32 or the
encrypted key 38 to add an additional layer of security.

Recovery of a file is accomplished by the archive server
referencing the index to obtain the encrypted key and the
volume of the encrypted file. The encrypted file is then
retrieved from the volume, and both the encrypted file and
encrypted key are transmitted back to the client. The client
then recovers the file by reversing the two stage process
used to encrypt. First, the secondary key must be recovered
by decrypting the encrypted key with the master key. Second, the
original file may be recovered by decrypting the encrypted
file with the secondary key.

Referring to FIGS. 1 and 3, for file recovery the archive
server searches the tape index disk file 40 at step 200 to
look up the encrypted key 44 and the location of the magnetic
tape volume 36. The server then retrieves the encrypted key
at step 202 and retrieves the encrypted file 42 from long term
storage via tape drive 34, as shown in step 204. The
encrypted file 48 and encrypted key 46 are then transmitted
back to the source system 8 as indicated by steps 206 and
208, respectively.

Once received by the source system 8, the master key 22
is used to decrypt the encrypted key 46 at step 210 and
recover the secondary key 18, as shown in step 212. The
secondary key 18 is then used to decrypt the encrypted file
48 as shown in step 214 to produce the recovered file 50
which is identical to the original file 10, as indicated by step
216.
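
The recovery steps on the source system can be sketched as below. As before, this is an illustration under assumptions rather than the disclosed implementation: the cipher is a toy SHA-256 keystream stand-in that is not secure, and `toy_encrypt`, `toy_decrypt` and `recover_on_source` are hypothetical names.

```python
import hashlib
import secrets


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy keystream: SHA-256 over key, nonce and a block counter (NOT secure).
    out = bytearray()
    for i in range(0, len(data), 32):
        counter = (i // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    return nonce + _xor_stream(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    return _xor_stream(key, blob[:16], blob[16:])


def recover_on_source(encrypted_file: bytes, encrypted_key: bytes,
                      master_key: bytes) -> bytes:
    # Steps 210 and 212: the master key recovers the secondary key.
    secondary_key = toy_decrypt(master_key, encrypted_key)
    # Steps 214 and 216: the secondary key recovers the file.
    return toy_decrypt(secondary_key, encrypted_file)
```

With the correct master key the function returns the original bytes. With any other master key the first stage yields a wrong secondary key, so the second stage does not produce the original file, which is how the source system retains access control.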

File deletion involves searching the tape index disk file
40 for the entry corresponding to the file 10 marked for
deletion. Rather than retrieving the key and volume,
however, the encrypted key 44 is deleted and the storage
area in the tape index disk file 40 overwritten with zero
values. This overwriting is required to avoid future access to
the encrypted key 44 through use of a sector level disk
access, as many file systems merely flag a deleted area as
available, and data physically remains unaltered until a
subsequent write needs the available space. Elimination of
the encrypted key effectively precludes future access to the
contents of the archived file stored on magnetic tape without
requiring physical modification to the archive volume; only
the encrypted key is deleted. Therefore, there is no compromise
of the integrity of adjacent entities on the tape, and no
extraneous versions of sensitive data.

Following overwrite of the encrypted key 44, the information
in the encrypted file 32 remains secure. No modification
of the magnetic tape volume 36 is required, as the
encryption ensures that the information remains unintelligible.

Effectiveness of this method suggests that the encryption
take place no more remotely than the limits of the source
system organization's proprietary, or internal, network, as
unprotected electronic transfers can also compromise the
data. The dotted line 52 on FIG. 1 indicates the extent of
unencrypted data and should represent no greater extent than
the intranet of the originating entity.

Master key generation is significant because recovery of
a key allows recovery of the file that the key represents.
Consequently, control over access to and deletion of archived
files is dependent upon control over the corresponding
secondary keys. Each key, however, must be unique to the
file to which it corresponds; otherwise, exposure of a key to
decrypt a particular file compromises that key for all other
files which that key covers. If the source system is required
to maintain a separate key for all archived encrypted files,
however, there is merely a shift in storage medium, as the
key to each encrypted file, rather than the file, must still
be maintained. Encrypting individual secondary keys allows
the keys to be maintained as securely as the files. The source
system maintains a single master key, or several master keys
covering different groups of secondary keys. Control of the
archived, encrypted files is then focused through a master
key. The archiving entity retains a set of all encrypted files,
and maintains a mapping to the corresponding encrypted
keys for which the source organization holds the master key.

Having described the preferred embodiments of the
invention, other embodiments which incorporate concepts of
the invention will now become apparent to one skilled in the
art. Therefore, the invention should not be viewed as limited
to the disclosed embodiments but rather should be viewed as
limited only by the spirit and scope of the appended claims.

What is claimed is:

1. An electronic network for transferring data units among
storage elements comprising:

a communications link;

a source information processing system at a first end of
said communications link further comprising:

a master encryption key;

at least one secondary encryption key;

a first memory for storing data units and said master
and said at least one secondary encryption keys; and

an encryption engine for selectively encrypting said
data units to produce encrypted data units using at
least one of said secondary encryption keys, and for
encrypting said at least one secondary encryption
key with said master encryption key producing at
least one encrypted key; and

an archive server information processing system having at
least one archive server key at a second end of said
communications link comprising a second memory and
in communication with said source information processing
system, said archive server information processing
system for receiving and storing said encrypted
data units and said encrypted keys in said second
memory wherein said archive server key is used to
further encrypt said encrypted keys.

2. The network as in claim 1 wherein said first and said
second memories provide fixation in a medium selected
from the group consisting of electronic, magnetic, and
optical storage media.

3. The network as in claim 1 wherein said first memory
comprises a substantially real-time random access storage
medium.

4. The network as in claim 1 wherein said second memory
comprises a first and second storage area, said first storage
area comprising substantially real-time random access storage
medium and said second storage area comprising high-volume
storage, wherein said first and second storage areas are
divided by [illegible] of information stored thereby.

5. The network as in claim 4 wherein said high-volume
storage is comprised of detachable physical volumes capable
of selective and repeatable communication with said archive
server processing system.

6. The network as in claim 4 wherein said at least one
encrypted key is stored in said first storage area within said
second memory and said encrypted data units are stored in
said second storage area within said second memory.

7. The network as in claim 1 wherein said data units
comprise elements of a file system.

8. The network as in claim 1 wherein said data units
comprise a discrete and enumerable area within said first
memory.

9. The network as in claim 1 wherein said source information
processing system further comprises a computer and
said encryption engine is implemented by said computer
executing an encryption application having said master
encryption key, said at least one secondary key, and said data
units as inputs and said encrypted data units and said at least
one encrypted key as outputs.

10. The network as in claim 1 wherein said source
information processing system further comprises a computer
and said encryption engine is implemented by a circuit in
communication with said computer, said circuit having said
master encryption key, said at least one secondary encryption
key, and said data units as inputs and said encrypted data
units and said at least one encrypted key as outputs.

11. The network as in claim 1 further comprising a
plurality of said source information processing systems
electrically connected to said archive server information
processing system.

12. The network as in claim 1 wherein said data units
comprise subdivisions comprising a plurality of blocks and
said encryption is applied to said blocks wherein input to
said encryption includes values from said plurality of blocks
and the results of at least one previous encrypted block.

13. An electronic network for transferring data units
among storage elements comprising:

a communications link;

a source information processing system at a first end of
said communications link further comprising:

a master encryption key;

at least one secondary encryption key;

a first memory for storing data units and said master
and said at least one secondary encryption keys; and

an encryption engine for selectively encrypting said
data units to produce encrypted data units using at
least one of said secondary encryption keys, and for
encrypting said at least one secondary encryption
key with said master encryption key producing at
least one encrypted key; and

an archive server information processing system having at
least one archive server key at a second end of said
communications link comprising a second memory and
in communication with said source information processing
system, said archive server information processing
system for receiving and storing said encrypted
data units and said encrypted keys in said second
memory wherein said archive server key is used to
further encrypt said encrypted data units.

14. A method for providing secure archive for data 
generated in a first memory within a source information 
processing system comprising the steps of: 

identifying data for archive within said first memory; 

obtaining a secondary encryption key; 

encrypting said data with said secondary encryption key 
to produce encrypted data; 

obtaining a master encryption key; 

encrypting said secondary encryption key with said mas- 
ter encryption key to produce an encrypted key; 

transmitting said encrypted data and encrypted key to an 
archive information system having a second memory; 

writing said encrypted data and said encrypted key to said 
second memory; and 

overwriting the portion of said second memory where said 
encrypted key is stored. 

15. The method according to claim 14 wherein the step of 
transmitting comprises sending via electromagnetic 
medium. 

16. The method according to claim 14 wherein the step of 
transmitting is selected from the group consisting of trans- 
mitting via electronic network communications and trans- 
mitting via dedicated telephone modem connection. 

17. The method according to claim 14 wherein the step of 
identifying data for archive is comprised of demarcating an 
enumerated area within said first memory. 

18. The method according to claim 14 wherein the step of 
identifying data in first memory comprises locating infor- 
mation from fixation in a medium selected from the group 
consisting of magnetic, electronic and optical. 

19. The method according to claim 14 wherein the step of 
writing to second memory consists of fixation in a medium 
selected from the group consisting of magnetic, electronic 
and optical. 

20. The method according to claim 14 wherein said data 
is subdivided into a plurality of blocks and input to said 
encrypting includes the results of at least one previous 
encrypting of said blocks. 

21. A method for providing secure archive for data 
generated in a first memory within a source information 
processing system comprising the steps of: 

identifying data for archive within said first memory; 

obtaining a secondary encryption key; 

encrypting said data with said secondary encryption key 
to produce encrypted data; 

obtaining a master encryption key; 

encrypting said secondary encryption key with said mas- 
ter encryption key to produce an encrypted key; 

transmitting said encrypted data and encrypted key to an 
archive information system having a second memory 
and an archive server encryption key; 

further encrypting said encrypted key with said archive 
server encryption key; 

writing said encrypted data and said encrypted key to said 
second memory. 

22. A method for providing secure archive for data 
generated in a first memory within a source information 
processing system comprising the steps of: 

identifying data for archive within said first memory; 

obtaining a secondary encryption key; 

encrypting said data with said secondary encryption key
to produce encrypted data;

obtaining a master encryption key;

encrypting said secondary encryption key with said master
encryption key to produce an encrypted key;

transmitting said encrypted data and encrypted key to an
archive information system having a second memory
and an archive server encryption key;

further encrypting said encrypted data with said archive
server encryption key;

writing said encrypted data and said encrypted key to said
second memory.

23. A method for providing secure archive for data
generated in a first memory within a source information
processing system comprising the steps of:

identifying data for archive within said first memory;

obtaining a secondary encryption key;

encrypting said data with said secondary encryption key
to produce encrypted data;

obtaining a master encryption key;

encrypting said secondary encryption key with said master
encryption key to produce an encrypted key;

transmitting said encrypted data and encrypted key to an
archive information system having a second memory
and an archive server encryption key;

writing said encrypted data and said encrypted key to said
second memory;

retrieving said encrypted data and said encrypted key
from said second memory of said archive information
system;

decrypting said encrypted key with said archive server
encryption key;

transmitting said encrypted data and said encrypted key
from said archive information system to said source
information processing system;

decrypting said encrypted key with said master encryption
key to recover said secondary key; and

decrypting said encrypted data with said secondary key to
recover said data.

24. A method for providing secure archive for data
generated in a first memory within a source information
processing system comprising the steps of:

identifying data for archive within said first memory;

obtaining a secondary encryption key;

encrypting said data with said secondary encryption key
to produce encrypted data;

obtaining a master encryption key;

encrypting said secondary encryption key with said master
encryption key to produce an encrypted key;

transmitting said encrypted data and encrypted key to an
archive information system having a second memory
and an archive server encryption key;

writing said encrypted data and said encrypted key to said
second memory;

retrieving said encrypted data and said encrypted key
from said second memory of said archive information
system;

decrypting said encrypted data with said archive server
encryption key;

transmitting said encrypted data and said encrypted key
from said archive information system to said source
information processing system;

decrypting said encrypted key with said master encryption
key to recover said secondary key; and

decrypting said encrypted data with said secondary key to
recover said data.